<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE html
     PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
     "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en">
<head>
<title>LingPipe: Competition</title>
<meta http-equiv="Content-type"
      content="application/xhtml+xml; charset=utf-8"/>
<meta http-equiv="Content-Language"
      content="en"/>
<link href="css/lp-site.css"
      title="lp-site"
      type="text/css"
      rel="stylesheet"
      media="screen,projection,tv" />
<link href="css/lp-site-print.css"
      title="lp-site-print"
      type="text/css"
      rel="stylesheet"
      media="print,handheld,tty,aural,braille,embossed"/>
</head>

<body>

<div id="header">
<h1 id="product">LingPipe</h1><h1 id="pagetitle">Competition</h1>
<a id="logo"
   href="http://alias-i.com/"
  ><img src="img/logo-small.gif" alt="alias-i logo"/>
</a>
</div><!-- head -->


<div id="navig">

<!-- set class="current" for current link -->
<ul>
<li><a href="../index.html">home</a></li>

<li><a href="demos.html">demos</a></li>

<li><a href="licensing.html">license</a></li>

<li>download
<ul>
<li><a href="download.html">lingpipe core</a></li>
<li><a href="models.html">models</a></li>
</ul>
</li>

<li>docs
<ul>
<li><a href="install.html">install</a></li>
<li><a href="../demos/tutorial/read-me.html">tutorials</a></li>
<li><a href="../docs/api/index.html">javadoc</a></li>
<li><a href="book.html">textbook</a></li>
</ul>
</li>

<li>community
<ul>
<li><a href="customers.html">customers</a></li>
<li><a href="http://groups.yahoo.com/group/LingPipe/">newsgroup</a></li>
<li><a href="http://lingpipe-blog.com/">blog</a></li>
<li><a href="bugs.html">bugs</a></li>
<li><a href="sandbox.html">sandbox</a></li>
<li><a class="current" href="competition.html">competition</a></li>
<li><a href="citations.html">citations</a></li>
</ul>
</li>

<li><a href="contact.html">contact</a></li>

<li><a href="about.html">about alias-i</a></li>
</ul>

<div class="search">
<form action="http://www.google.com/search">
<p>
<input type="hidden" name="hl" value="en" />
<input type="hidden" name="ie" value="UTF-8" />
<input type="hidden" name="oe" value="UTF-8" />
<input type="hidden" name="sitesearch" value="alias-i.com" />
<input class="query" size="10%" name="q" value="" />
<br />
<input class="submit" type="submit" value="search" name="submit" />
<span style="font-size:.6em; color:#888">by&nbsp;Google</span>
</p>
</form>
</div>

</div><!-- navig -->


<div id="content" class="content">

<h2>LingPipe's Competition</h2>



<div class="sidebar">
<h2>Contributing to this Page</h2>
<p>
If you know of a natural language toolkit that's not listed on
this page, or if you have a clarification or correction for what
we list, please <a href="contact.html">contact us</a>.
</p>
</div>

<p>On this page, we break our competition down into academic
toolkits and industrial toolkits.  We only consider software
that is available for linguistic processing, not companies that
rely on linguistic processing in an application but do not
sell that technology.
</p>

<p>How does LingPipe compare to the offerings below? Here are a few key
points to keep in mind as you browse them:
</p>

<ul>

<li>We are a Geek2Geek business. Nearly every sale we have ever
made was started by a programmer with a problem to solve.
</li>

<li>We are dedicated to making it easier to use linguistics in
applications. We know that using linguistics requires a different view
of computation and want to make it part of a well-rounded developer's
skill set. How?

<ul>
  <li>We document extensively, write tutorials, build demos, and
        then do it some more.
  </li>
  <li>We present LingPipe to Java groups and to students.</li>
  <li>We sponsor hobby nights at our office in Brooklyn.</li>
  <li>We release the source code. The Royalty Free License makes it
      maximally easy to prototype and experiment.</li>
</ul>
</li>

<li>We claim no "magic pixie dust" proprietary algorithms. Our goal is
to provide industrial-strength implementations of proven,
well-understood technologies. That said, LingPipe is used a good deal in
research systems as a foundation over which to explore new ideas.</li>

<li>We offer free support through the LingPipe Yahoo! group, as well as
paid support.</li>

</ul>


<h2>Academic and Open Source Competition</h2>

<div class="sidebar">
<h2>Goals of Academic Systems</h2>
<p>
Most of the academic systems have been put together with an emphasis on
accuracy.  This is typically derived from state-of-the-art
machine learning and/or inference algorithms.  While accurate,
these methods tend to be much slower to train and decode
than the commercial systems such as LingPipe.
</p>
<p>
Some of the academic systems, such as NLTK, were defined primarily as
teaching aids.  Others, such as OpenNLP and GATE, were developed in
order to provide baseline tools to a large community.
</p>
</div>

<p>The following is a list of ongoing large-scale, multi-function
natural language toolkits that are built and distributed by academics.
</p>

<div class="sidebar">
<h2>One-off Academic Efforts</h2>
<p>
There are hundreds of software packages released by academics that
perform either a single task or illustrate a single point.  We have
compiled a list of lists of these at the bottom of the page in the:
</p>
<ul>
<li><a href="#other">Other Lists of NLP Tools Section</a></li>
</ul>

</div>

<div class="sidebar">
<h2>Search, Speech, Translation, OCR, ...?</h2>
<p>
We have intentionally not listed competitors focused on things
other than basic language processing tools.
</p>
<p>Companies in these businesses are more likely to
be LingPipe customers than LingPipe competitors.
</p>
</div>

<h3><a href="http://www.cs.wisc.edu/~bsettles/abner/">ABNER</a></h3>

<p>ABNER is a statistical named entity recognizer &quot;using linear-chain conditional random fields (CRFs) with a variety of orthographic and contextual features. Version 1.5 includes two models trained on the NLPBA and BioCreative corpora, for which performance is roughly state of the art.  The new version also includes a Java API allowing users to incorporate ABNER into their systems, as well as train and use models for other data.&quot;  Written by Burr
Settles at the University of Wisconsin-Madison.  Released with source under
the <a href="http://www.opensource.org/licenses/cpl1.0.php">Common Public License</a>.
</p>

<h3><a href="http://balie.sourceforge.net/">BALIE</a></h3>
<p>
The Baseline Information Extraction (BALIE) system is a Java natural
language toolkit developed at the University of Ottawa and released
under the GNU <a
href="http://www.gnu.org/copyleft/gpl.html">General Public
License</a>.  BALIE provides language ID, sentence detection,
tokenization and named-entity recognition.  Here's the <a href="http://balie.sourceforge.net/doc/">BALIE javadoc</a>.
</p>


<h3><a href="http://banner.sourceforge.net/">BANNER</a></h3>

<p>&quot;BANNER is a named entity recognition system, primarily intended for biomedical text. It is a machine-learning system based on conditional random fields and contains a wide survey of the best features in recent literature on biomedical named entity recognition (NER). BANNER is portable and is designed to maximize domain independence by not employing semantic features or rule-based processing steps. It is therefore useful to developers as an extensible NER implementation, to researchers as a standard for comparing innovative techniques, and to biologists requiring the ability to find novel entities in large amounts of text.

BANNER is released under the <a href="http://www.opensource.org/licenses/cpl1.0.php">Common Public License</a>.&quot;</p>


<h3><a href="http://garraf.epsevg.upc.es/freeling/">FreeLing</a></h3>
<p>
FreeLing is a set of C++ tools developed at the Universitat
Politècnica de Catalunya and released under the GNU <a
href="http://www.fsf.org/licenses/lgpl.html">Lesser General Public
License</a>.  FreeLing provides sentence detection, morphological
analysis, named entities, POS tagging, shallow parsing, dependency
parsing and word sense disambiguation.  Here's a link to their <a
href="http://garraf.epsevg.upc.es/freeling/doc/userman/html/">user
manual</a>.
</p>


<a name="dragon"/>
<h3><a href="http://dragon.ischool.drexel.edu/default.asp">The Dragon Toolkit</a></h3>
<p>&quot;The Dragon Toolkit is a Java-based development package for academic use in information retrieval (IR) and text mining (TM, including text classification, text clustering, text summarization, and topic modeling). It is tailored for researchers who work on large-scale IR and TM and prefer Java programming.&quot;
</p>

<h3><a href="http://www.ellogon.org/">Ellogon</a></h3>
<p>
&quot;Ellogon is different from other similar software. First of all, it respects the user's time by offering a simple and user friendly graphical interface. But beneath this simple appearance a powerful engine is hidden, that has been proved to be able to support a wide range of uses, from simple research prototypes to commercial applications.  Ellogon is licensed under the GNU LGPL license, is easy to install and administer and is reliable. Running under all major operating systems, Ellogon offers a comfortable environment for computational linguists, language engineers or plain users.&quot;
</p>

<a name="gate"/>
<h3><a href="http://gate.ac.uk/">GATE</a></h3>
<p>GATE is a Java text mining toolkit developed at the University of
Sheffield and released under the <a href="http://gate.ac.uk/gate/licence.html">GNU
Lesser General Public License</a>.  GATE
provides a general offset-oriented development/deployment
environment/framework and some rule-based tools to run within that framework.
Many other <a href="http://gate.ac.uk/gate/doc/plugins.html">GATE plugins</a>
have been contributed by Sheffield and third parties.
Here is a link to their <a href="http://gate.ac.uk/sale/tao/index.html">user guide</a> and <a href="http://www.gate.ac.uk/releases/gate-3.1-build2270-ALL/doc/javadoc/index.html">javadoc</a>.
</p>

<h3><a href="http://www.julielab.de/">JULIE</a></h3>
<p>
&quot;The JULIE Lab here offers a comprehensive NLP tool suite for the application purposes of semantic search, information extraction and text mining. Most of our continuously expanding tool suite is based on machine learning methods and thus is domain- and language independent.
</p>
<p>
One main feature is that we offer our tools both as stand-alone programs and wrapped within the UIMA framework. UIMA is an open-source, industrial-strength, scaleable and extensible platform for creating, integrating and deploying unstructured information management solutions from combinations of semantic analysis and search components.&quot;
</p>

<h3><a href="http://lucene.apache.org/mahout/">Apache Lucene Mahout</a></h3>
<p>
&quot;Mahout's goal is to build scalable, Apache licensed machine learning
libraries. Initially, we are interested in building out the ten
machine learning libraries detailed in [Chu et al.'s 2006 NIPS paper
<a href="http://books.nips.cc/papers/files/nips19/NIPS2006_0725.pdf">
Map-Reduce for Machine Learning on Multicore</a>]
using <a href="http://hadoop.apache.org/core/">[Apache] Hadoop</a>.&quot;
</p>

<h3><a href="http://mallet.cs.umass.edu/index.php/Main_Page">MALLET</a></h3>
<p>
MALLET is a Java natural language toolkit developed at the University
of Massachusetts and released under the <a
href="http://www.opensource.org/licenses/cpl1.0.php">Common Public
License</a>.  MALLET is most widely used for classification and
sequence modeling. It also includes clustering.  It provides maximum entropy
training, including conditional random fields, general undirected
graphical models, finite-state transducers, and some general numerical
optimization classes.  Here's a link to their <a
href="http://mallet.cs.umass.edu/api/">javadoc</a>.
</p>


<h3><a href="http://maltparser.org/">MaltParser</a></h3>

<p>&quot;MaltParser is a system for data-driven dependency parsing, which can be used to induce a parsing model from treebank data and to parse new data using an induced model. MaltParser is developed by Johan Hall, Jens Nilsson and Joakim Nivre at Vaxjo University and Uppsala University, Sweden.&quot;</p>


<h3><a href="http://www.mindswap.org/2005/SMORE/">MINDSWAP</a></h3>

<p>The Maryland Information and Network Dynamics Lab Semantic Web Agents Project (MINDSWAP) offers a generic Semantic Web annotation toolkit, SMORE, &quot;designed to enable users to markup HTML documents in OWL using Web Ontologies.&quot;</p>


<h3><a href="http://sourceforge.net/apps/trac/minorthird/wiki">Minor Third</a></h3>
<p>
Minor Third is a Java natural language toolkit developed at Carnegie
Mellon University and released under the <a
href="http://www.opensource.org/licenses/bsd-license.php">BSD
License</a>.  Minor Third provides an extensive suite of general
and sequence classifiers, including KNN, active learning, SVMs,
decision trees, CRFs, CMMs, boosting, perceptrons, etc.
</p>

<h3>MontyLingua</h3>
<p>
&quot;MontyLingua is a free for research use, commonsense-enriched,
end-to-end natural language understander for English. Feed raw English
text into MontyLingua, and the output will be a semantic
interpretation of that text. Perfect for information retrieval and
extraction, request processing, and question answering. From English
sentences, it extracts subject/verb/object tuples, extracts
adjectives, noun phrases and verb phrases, and extracts people's
names, places, events, dates and times, and other semantic
information. MontyLingua makes traditionally difficult language
processing tasks trivial!&quot;</p>

<h3><a href="http://morphadorner.northwestern.edu/morphadorner/">MorphAdorner</a></h3>

<p>An extensive package developed at Northwestern for spelling,
lemmatization, sentence detection, part-of-speech tagging, named entity
detection, and phrase chunking.  It rolls many other projects into a
coherent and well-documented package.  It is mainly aimed at historical
forms of English, but is retrainable.  Very open license.</p>


<h3><a href="http://www.nactem.ac.uk">NaCTeM</a></h3>
<p>
&quot;The National Centre for Text Mining (NaCTeM) is the first publicly-funded text mining
  centre in the world. We provide text mining services in response to the requirements
  of the UK academic community. NaCTeM is operated by the University of Manchester with close collaboration with the University of Tokyo.&quot;
</p>
<p>
On the NaCTeM website, you can find pointers to sources of information about
  text mining, such as links to
</p>
<ul>

    <li> text mining services provided by NaCTeM</li>

    <li> software tools, both those developed by the NaCTeM team and by other text mining groups</li>

    <li> seminars, general events, conferences and workshops</li>

    <li> tutorials and demonstrations</li>

    <li> text mining publications</li>
</ul>


<h3><a href="http://nltk.sourceforge.net/">NLTK</a></h3>
<p>The Natural Language Toolkit (NLTK) is a general Python toolkit for
natural language processing, developed at the University of Melbourne
and released under the GNU <a
href="http://www.gnu.org/copyleft/gpl.html">General Public
License</a>.  NLTK contains modules for heuristic and statistical
tagging (including the Brill tagger) and chunking, full parsing (CFG),
and clustering (including K-means and EM).
The <a href="http://nltk.sourceforge.net/docs.html">documentation page</a> contains pointers to tutorials and API documentation.  It's also distributed with
a range of interesting data.
</p>

<h3><a href="http://opennlp.sourceforge.net/">OpenNLP</a></h3>
<p>
OpenNLP is a heterogeneous collection of <a
href="http://opennlp.sourceforge.net/projects.html">projects</a>
distributed under a variety of open source licenses.  The main
projects being developed for OpenNLP itself include a general Java <a
href="http://maxent.sourceforge.net/">maximum entropy</a> package
released under the GNU <a
href="http://www.fsf.org/licenses/lgpl.html">Lesser General Public
License</a>.  Here's the <a
href="http://maxent.sourceforge.net/api/index.html">maxent
javadoc</a>.  There's a tools API to go with it; here's the <a
href="http://opennlp.sourceforge.net/api/index.html">tools
javadoc</a>.  The tools include statistical tokenizers, sentence
detection, name finders, part-of-speech taggers and full syntactic
(PCFG) parsing. This is one of the few packages to do coreference
resolution.
</p>

<h3><a href="http://www.proxem.com">Proxem</a></h3>
<p>
Actually, it's hard to tell if this is academic or commercial.
They offer a suite of .NET tools for natural language processing
called <a href="http://www.proxem.com/Antelope/tabid/55/Default.aspx">Antelope</a>.
It looks like much of the work is based on wrappers to other packages.
</p>

<h3><a href="http://www.informatics.susx.ac.uk/research/nlp/rasp/">RASP</a></h3>
<p>
RASP is an NLP framework for &quot;robust accurate statistical
parsing&quot;.  It is trained using the British English corpora
Susanne, LOB and BNC.  RASP includes a tokenizer, part-of-speech
tagger, hand-built FST-based morphological analyzer for English,
grammar-based parser and parse reranking model.
</p>
<p>
RASP is distributed only as an executable and licensed under its own
<a href="http://www.cstit.cl.cam.ac.uk/cgi-bin/p/licence/licence?_page=bli&amp;_id=cK">non-commercial license</a> for education or research.
</p>
<p>
There is a <a href="http://acl.ldc.upenn.edu/P/P06/P06-4020.pdf">RASP
white paper</a> describing the system, including dependency parser
accuracy evaluation, and a longer <a
href="http://www.cl.cam.ac.uk/techreports/UCAM-CL-TR-662.html">technical
report</a> about the grammar formalism used.
</p>

<h3><a href="http://nlp.stanford.edu/software/index.shtml">Stanford NLP Software</a></h3>
<p>
This is a collection of software packages with Java source licensed under the
GNU <a href="http://www.gnu.org/copyleft/gpl.html">General Public
License</a>.  Tools include a part-of-speech tagger, text classifier,
and PCFG parser.  They no longer provide public access to their full JavaNLP
toolkit.
</p>

<h3><a href="http://www.speech.sri.com/projects/srilm/">SRI International</a></h3>
<p>
&quot;SRILM is a toolkit for building and applying statistical language models (LMs), primarily for use in speech recognition, statistical tagging and segmentation, and machine translation. It has been under development in the SRI Speech Technology and Research Laboratory since 1995. The toolkit has also greatly benefitted from its use and enhancements during the Johns Hopkins University/CLSP summer workshops in 1995, 1996, 1997, and 2002.&quot;
</p>

<h3><a href="http://www.ims.uni-stuttgart.de/projekte/corplex/TreeTagger/">TreeTagger</a></h3>
<p>
TreeTagger is Helmut Schmid's multilingual part-of-speech tagger.
It has a <a
href="http://www.ims.uni-stuttgart.de/~schmid/Tagger-Licence">research-only license</a>.
</p>

<h3><a href="http://www.cs.waikato.ac.nz/ml/weka/">WEKA</a></h3>
<p>
WEKA is the University of Waikato's data mining software released
under the GNU <a href="http://www.gnu.org/copyleft/gpl.html">General
Public License</a>.  It's not a natural language processing toolkit,
but a very extensive general machine learning toolkit.  It provides a
nice graphical interface for evaluation and supports just about every
machine learning algorithm known to research.  It's based on Ian
Witten and Eibe Frank's book <i>Data Mining</i>. Here's a pointer to
the <a
href="http://www.cs.waikato.ac.nz/ml/weka/index_documentation.html">WEKA
documentation</a>, which includes tutorials and pointers to the
javadoc.
</p>

<h2>Industrial Competition</h2>

<p>
The following is a list of competitors with quotes from their own
web pages.  We've listed technology components where we could find them.
</p>

<h3><a href="http://www.accenture.com/Global/Services/Accenture_Technology_Labs/default.htm">Accenture Technology Labs</a></h3>
<p>
Accenture is offering a <a
href="http://www.accenture.com/Global/Services/Accenture_Technology_Labs/R_and_I/SentimentServices.htm">sentiment
monitoring service</a>: &quot;Sentiment Monitoring Services searches
preferred sites or newsgroups on the Internet for opinions. Using
advanced language technologies, it interprets the sentiment of the
text towards a specified product or service and then provides the user
with an analysis of the results. Sentiment Monitoring Services
combines a search agent and a perception engine to present users with
an instant gauge of market perception of any feature, product, brand
or organization. The natural language processor of the perception
engine achieves an accuracy of approximately 90 percent compared to
opinion ratings ranked manually.&quot; 
</p>

<h3><a href="http://adaptivesemantics.com/">Adaptive Semantics</a></h3>
<p>
&quot;ASI specializes in sentiment analysis, powered by cutting edge machine learning methods. From major blog networks, to social networks, to community reviews sites, JuLiA offers automated comment moderation, user-profiling, and a series of reporting tools to enhance and streamline the entire community moderation process.&quot;
</p>
<p>
&quot;The Huffington Post has acquired Adaptive Semantics, its first purchase of another company. HuffPo wants to use Adaptive Semantics software, which provides learning and sentiment analysis technology to continue to scale community and work in tandem with the site's team of human moderators.&quot;
</p>

<h3><a href="http://www.alifemedical.com/">A-Life Medical</a></h3>
<p> 
&quot;A-Life Medicals patented Natural Language Processing (NLP)
technology utilizes proprietary knowledge-bases of more than ten
million facts to automate the coding process. Our technology combined
with our software solutions and services are dramatically changing the
way healthcare codes, submits claims, collects reimbursement, as well
as improving patient care.&quot;
</p>
<p> A-Life produces Alacer,
&quot;the first end-to-end practice management system that integrates
document management, real-time NLP coding, billing, collections,
denials management, and auditing into one streamlined Windows-based
platform.&quot;
</p>

<h3><a href="http://alitora.com/">Alitora Systems</a></h3>
<p>
&quot;Alitora System provides comprehensive software solutions for
biotech research, management, compliance, intellectual property
management and competitive intelligence. Our software enables users to
search, annotate and collaborate, seamlessly, allowing the annotation
of information as simple as a click, and collaboration as simple as a
drag-and-drop.&quot;
</p>
<p>Alitora offers kHarmony, which
&quot;allows users to identify concepts that are of interest, and then
search for information relating specifically to those
concepts...&quot;.  Alitora describes kHarmony as
&quot;using proprietary graph-theoretic and information retrieval
techniques to provide structure to unstructured data, perform data
clustering, and enable visual data exploration.&quot;
</p>

<h3><a href="http://www.appen.com.au/">Appen</a></h3>
<p>
&quot;Appen develops and markets sophisticated computer-based
speech and language technology products and services for major
international information and communication companies and government
organizations.&quot;
</p>
<p> Appen supplies a range of <a
href="http://www.appen.com.au/index.php/language-resources-and-services/text-corpora.html">corpora</a>,
as well as tools for <a
href="http://www.appen.com.au/index.php/tools/morphological-analyser.html">morphology</a>,
<a
href="http://www.appen.com.au/index.php/tools/sentiment-and-attitude-analysis.html">sentiment</a>,
and <a
href="http://www.appen.com.au/index.php/tools/text-attribution-tool.html">authorship</a>.
</p>

<h3><a href="http://www.ariadnegenomics.com/">Ariadne Genomics</a></h3>
<p>
&quot;Ariadne develops software tools for biologists in the areas of
pathway analysis and automated scientific text processing. Ariadne
products incorporate proprietary Natural Language Processing (NLP) and
statistical algorithms designed to functionally interpret novel
genetic information.&quot;
</p>
<p>
Although they don't sell NLP software per se, they build
applications such as their <a
href="http://www.ariadnegenomics.com/products/medscan/reader/">Medscan
Reader</a> for text data mining over biomedical research articles.  It
uses entity extraction (e.g. genes and diseases), specific relation
extraction (e.g. binding or regulation) and sentence-level search
and summarization.
</p>

<h3><a href="http://www.attensity.com">Attensity</a></h3>
<p>
&quot;Attensity's breakthrough Text Analytics solutions enable
computers to understand and process free-form text, offering
organizations the opportunity to leverage the vast amounts of
information contained in non-structured formats. The technology allows
users to extract and analyze facts like who, what, where, why, under
what conditions and to whom, as well as opinions and events found in
unstructured data.&quot;
</p>
<p>
&quot;Attensity offers a complete suite of products for Text
Analytics. The suite includes both targeted and Exhaustive Extraction
Engines that pull the information out of text and put it into a usable
format, analysis and discovery applications that allow you to explore
and make sense of the data, knowledge libraries and knowledge
engineering tools that provide the ability to define what to extract
and categories to put data in, and an integration toolkit.&quot;
</p>


<h3><a href="http://www.autonomy.com/">Autonomy</a></h3>
<p>
&quot;Autonomy is the acknowledged leader in the rapidly growing area of Meaning Based Computing (MBC).&quot;
</p>
<p>&quot;Meaning Based Computing not only uncovers, but also makes sense of, the 85% of enterprise information that is hidden to all other technologies including keyword search engines and relational databases. ...
Meaning Based Computing enables organizations to automatically form a contextual understanding of people's interests, behavior and ongoing interaction with any type of information.
...
Meaning Based Computing enables organizations to extract meaningful evidence from terabytes of email, documents, spreadsheets and other unstructured information.&quot;
</p>



<h3><a href="http://www.basistech.com/">Basis Technology</a></h3>
<p>
&quot;Basis Technology provides software solutions for extracting meaningful intelligence from unstructured text in Asian, European and Middle Eastern languages.  We help technology companies  and government organizations improve the accuracy of information retrieval, text mining and other applications through advanced linguistics.&quot;
</p>
<p>
Basis provides entity extraction in ten languages, language
identification, and Chinese and Japanese character-level support.
</p>


<h3><a href="http://www.baynote.com/">Baynote</a></h3>

<p>&quot;Baynote delivers the industry-leading recommendation engine for products and content as well as UseRank social search for websites, intranets, and portals. Our Collective Intelligence Platform allows businesses to better understand their visitors' intent and context and automatically display the best content and products based on that insight.&quot;</p>


<h3><a href="http://www.bbn.com/technology/knowledge/semantic_web_applications">BBN</a></h3>
<p>
&quot;BBN is the leader in the development of the new "Semantic Web," which will enable powerful searches and automated agents. BBN has been coordinating the work of 23 US and international research teams in conjunction with the World Wide Web Consortium and European Union collaborators to drive the transition to a semantic web.&quot;
</p>


<h3><a href="http://www.bitext.com/">Bitext</a></h3>
<p>
&quot;Providing natural language understanding to any application.
We have already succeeded with search engines, databases, virtual assistants...&quot;
</p>
<p>
Bitext (for &quot;bits and text&quot;) provides NaturalFinder, &quot;the essential complement for any search engine for Internet and intranets which allows users to query in natural language (Spanish or English) without using booleans or wildcards.&quot;
</p>

<h3><a href="http://www.biz360.com/">Biz360</a></h3>
<p>
&quot;At Biz360, we are committed to removing the inefficiencies of traditional market research and measurement through technology and product innovation. Biz360 uses proprietary technology, analytics and natural language processing (NLP) to aggregate and analyze vast amounts of media and market information to yield insights which help marketing companies better understand, reach and motivate their key target audiences.&quot;
</p>
<p>
Their offerings are centered around sentiment analysis and subject
tracking through dashboard-style reporting.
</p>

<h3><a href="http://www.brainware.com/">Brainware</a></h3>
<p>
&quot;Brainware helps the world's leading companies automatically extract, process, and retrieve data from any source. Our Data Capture solutions virtually eliminate manual data entry while our Enterprise Search solutions allow you to stop searching and start finding...&quot;
</p>
<p>The technology behind Brainware's <a href="http://www.brainware.com/data_capture.php">Intelligent Data Capture Distiller</a> involves &quot;patented neural network-based classification&quot; as well as
&quot;pattern recognition technologies and 'fuzzy logic' to accurately sort documents and extract key data fields even when fields shift positions from document to document.&quot;
</p>

<h3><a href="http://www.butlerhill.com/">Butler Hill Group</a></h3>
<p>
&quot;The Butler Hill Group is a streamlined network of linguists,
computer scientists, language experts and research librarians with
expertise in the natural language issues of computer technology. We
maintain solid relationships with highly skilled consultants ...  our
past projects include machine translation, web search, lexicon
evaluation, product usability studies, speech and product localization
...&quot;
</p>
<p>
Butler Hill primarily provides services for corpus creation,
evaluation and internationalization.
</p>

<h3><a href="http://www.cambridgesemantics.com/">Cambridge Semantics</a></h3>
<p>
&quot;Cambridge Semantics empowers even non-technical users to leverage their familiarity with Excel (using Anzo for Excel) and the simplicity of our Web visualization and forms tool (Anzo on the Web) to rapidly develop solutions that connect critical enterprise data to an information fabric, making that data available to be combined, manipulated and shared at the moment it is needed.&quot;
</p>

<h3><a href="http://company.carrot-search.com/index.html">Carrot Search</a></h3>
<p>
Carrot Search offers &quot;professional installation, customization, clustering and text mining consulting services based on Open Source and proprietary software.&quot;  They also offer <a href="http://company.carrot-search.com/lingo-applications.html">Lingo3G</a>, a
&quot;Document Clustering Engine that can organize
collections of text documents into clearly labeled thematic
groups. Accurately and on-the-fly.&quot;
</p>
<p>
They also offer the open source framework <a
href="http://project.carrot2.org/">Carrot<sup>2</sup></a>, which
provides federated search with clustering over popular search engines and
APIs, including Lucene.
</p>

<h3><a href="http://clarabridge.com/">Clarabridge</a></h3>
<p> &quot;Clarabridge's text mining software transforms text into
actionable insight to improve market research, customer care, product
development, quality assurance and risk management. Clarabridge's
award-winning software links the worlds of text mining, search and
business intelligence (BI) to enable enterprises to more quickly and
intuitively leverage all of their data to make better business
decisions.&quot;
</p>
<p>
Their offerings include data deduplication/cleansing, data linkage/merging,
document segmentation and categorization, entity extraction,
event, relationship and fact extraction, table parsing and image processing, as well
as search and visualization on top of these.
</p>

<h3><a href="http://www.clearforest.com/">ClearForest</a></h3>
<p>
&quot;ClearForest's text-driven business intelligence solutions help organizations make more informed business decisions by doing what search technologies do not--extract free text for use within analytics applications and BI systems. We provide the analytical bridge between two previously disconnected worlds of information--unstructured text and enterprise data. In allowing both to be analyzed simultaneously, ClearForest makes unified business intelligence a reality.&quot;
</p>
<p>
&quot;ClearForest Tags' open and flexible platform supports
statistical, structural and semantic tagging as well as custom
taggers, industry and custom taxonomies, and information agents.&quot;
</p>

<h3><a href="http://www.coderyte.com/">CodeRyte</a></h3>
<p>
&quot;Trust, context and confidence anchor CodeRyte's
natural language processing (NLP) technology.&quot;
</p>
<p>
&quot;The technology reads medical reports and identifies accurate CPT and ICD codes from the text of a physicians documentation.&quot;  They very helpfully point to the American
Health Information Management Association's page on <a href="http://library.ahima.org/xpedio/groups/public/documents/ahima/bok1_025099.hcsp?dDocName=bok1_025099">Delving into Computer-assisted Coding</a>.
</p>

<h3><a href="http://www.cognition.com/">Cognition</a></h3>
<p> 
&quot;CognitionSearch, the Company's patented meaning-based
linguistic Search architecture, is able to deliver significantly
higher levels of relevant search results than is possible with
currently used Search technologies.&quot;
</p>
<p>
&quot;The technology employs a unique mix of linguistics and
mathematical algorithms which has, in effect, 'taught' the computer
the meanings (or associated concepts) of nearly all the words and
frequently used phrases within the common English language.  It also
has knowledge of the relations between words and phrases, especially
paraphrase and taxonomy.&quot;
</p>
<p>
&quot;CognitionSearch is the only commercially available technology
that combines natural language queries with linguistic meaning-based
Search and semantics.  It incorporates statistical algorithms with
linguistically mapped coverage of the English language,...&quot;
</p>

<h3><a href="http://www.connexor.com/">Connexor</a></h3>
<p>
&quot;Connexor provides linguistic technologies and expertise to
software houses and solution providers who tackle the challenge of how
to derive useful information from unstructured digital text for
different kinds of consumers and analysts.&quot;
</p>
<p>
&quot;Connexor's Machinese software discovers the grammar and
semantics of natural language. Machinese enriches text with linguistic
markup: a uniform programmer's interface that enables use of text
content in software applications and solutions.&quot;
</p>
<p>
Connexor makes some excellent heuristic/rule-based part-of-speech taggers,
named entity extractors and dependency parsers.
</p>

<h3><a href="http://www.connotate.com">Connotate</a></h3>
<p>
&quot;Go Beyond Search... Access, Share and Deliver Intelligence
and Awareness, not just Links&quot;.
&quot;At Connotate we believe in working smarter, not harder.&quot;
</p>
<p>
&quot;Utilizing patented machine learning algorithms, Agents are
easily trained in minutes using a simple point-and-click process that
requires NO programming. Agents can be deployed to do anything a human
can do to mine, monitor, survey, collect, aggregate and normalize
dynamic financial content deep within the Web or in the Enterprise
into actionable business intelligence.&quot;
</p>

<h3><a href="http://contentanalyst.com/">Content Analyst</a></h3>
<p> 
&quot;Stop searching, start doing.&quot; Content Analyst supplies
a range of text analytics applications, including <a
href="http://contentanalyst.com/html/challenges/challenges_sorting.html">classification</a>,
<a
href="http://contentanalyst.com/html/tech/technologies_nametracking.html">named
entity coreference</a> and <a
href="http://contentanalyst.com/html/tech/technologies_relationship.html">relationship
discovery</a>.  
</p>
<p> 
Their technology is based on <a
href="http://contentanalyst.com/html/tech/technologies_lsi.html">latent
semantic indexing</a>, a dimensionality reduction technique based on
singular value decomposition of a matrix of co-occurrences.  The
technology was acquired when Content Analyst <a
href="http://contentanalyst.com/html/whoweare/whoweare_press_saic.html">spun
out from SAIC</a>.  
</p>

<h3><a href="http://www.crawdadtech.com/">Crawdad Technologies</a></h3>
<p>
&quot;Crawdad Technologies, LLC provides software and services to
analysts and research professionals who need to transform unstructured
text into insight.&quot;
</p>
<p>
Crawdad supplies <a
href="http://www.crawdadtech.com/html/01_software.html">Crawdad
Desktop</a>, which does scraping and some natural language processing
involving classification and terminology extraction.
</p>
<p>
Crawdad also builds <a
href="http://www.crawdadtech.com/html/00b_improve.html">Listening
Posts</a>, which &quot;listens to blogs, discussion boards, chat
rooms, social networking sites, and online media for news or opinion
about products, brands, celebrities, and issues. Users view a daily
dashboard which uses patented natural language processing technology
to analyze the buzz on the Web and make sense of it.&quot;
</p>

<h3><a href="http://www.digitalreasoning.com/solutions/">Digital Reasoning</a></h3>
<p>
&quot;Digital Reasoning Systems develops software solutions that rapidly process and organize unstructured data into meaningful, relevant and quantifiable knowledge automatically.  The core technology, embodied in Synthesys, can be used to provide for the automatic categorization, linking, retrieval and profiling of unstructured data.&quot;
</p>

<h3><a href="http://doloreslabs.com/index.html">Dolores Labs</a></h3>
<p>
&quot;We make crowdsourcing easy.&quot;  Dolores Labs is involved
in classification including topic and sentiment, document de-duplication,
and other natural language tasks.
</p>
<p>Dolores Labs are not so much competing as providing a
complementary service: data annotation via crowdsourcing,
for which they've used Amazon's <a href="http://www.mturk.com/mturk/welcome">Mechanical Turk</a>.
</p>

<h3><a href="http://www.engenium.com/">Engenium</a></h3>
<p>
&quot;Engenium is a pioneer in conceptual search technology that
increases the effectiveness of electronic information
retrieval. Unlike keyword searching that is limited to precisely
matching the language of a given query, Engenium's Semetric concept
search engines and integrated Autometric clustering engine analyze
documents by meaning, concept and context. This yields better, faster
search results --- uncovers information that otherwise would remain
buried --- and enables organizations to work smarter.&quot;
</p>
<p>They seem to be applying latent semantic analysis (a kind of
principal component analysis) to search and clustering.
</p>


<h3><a href="http://www.evri.com/">Evri</a></h3>

<p>&quot;Using semantic understanding of content, Evri is building the data graph of the web. We'll use this to create interesting and meaningful connections without having to search.&quot;</p>


<h3><a href="http://www.exalead.com/software/products/cloudview/">Exalead</a></h3>

<p>&quot;Exalead CloudView is a one-of-a-kind search engine that collects unstructured and structured data from any source, in any format and in any volume, and automatically transforms it into a single structured information resource. This resource, which continually evolves and adapts as your data evolves, can be directly searched or used to develop innovative search-based applications (SBAs).&quot;</p>


<h3><a href="http://www.expertsystem.net/">Expert System</a></h3>
<p>Expert System is a &quot;leading provider of Semantic Intelligence software to discover, classify, and understand information contained in unstructured text. Expert System technology,  COGITO enables natural language processing. It leverages full semantic analysis to automatically understand the content from any textual document, including the retrieval of meanings and the comprehension of natural language.
</p>
<p>
Semantic Intelligence enables you to read, understand, and extract the most relevant concepts present in the huge amount of documents, websites, presentations, emails and blogs that are accessible to us everyday.&quot;
</p>


<h3><a href="http://extractiv.com/">Extractiv</a></h3>

<p>&quot;Extractiv is a new content provisioning service that helps consumers &quot;make sense&quot; of large amounts of unstructured text. We use natural language processing in conjunction with one of the world best distributed computing platform in order to turn text into structured data that can be used in a variety of apps, such as sentiment tracking or semantic search.&quot;</p>


<h3><a href="http://www.fastsearch.com/">Fast</a></h3>
<p>
&quot;We do not simply search, we find. We filter out all the
irrelevant, peripheral data and provide the exact information end
users are looking for.&quot;
</p>
<p>&quot;We have solutions that monitor competitive intelligence,
provide brand and litigation protection, support regulatory and policy
compliance, and investigate criminal and terrorist activity. They
don't just return results, they return confidence and
protection.&quot;
</p>


<h3><a href="http://www.fetch.com/index.asp">Fetch</a></h3>
<p>
&quot;Fetch Technologies provides innovative solutions for integrating
and accessing heterogeneous data sources.&quot;
</p>
<p>
Fetch isn't so much a direct competitor, but more of a complementary
technology aimed at scraping web pages and record linkage (also known
as database deduplication).
</p>


<h3><a href="http://www.gd-ais.com/">General Dynamics Advanced Information Systems</a></h3>

<p> &quot;General Dynamics Advanced Information Systems designs,
develops, manufactures, and integrates information solutions for
defense, intelligence, space and homeland security communities.&quot;
</p>

<p> &quot;General Dynamics Advanced Information Systems uses data
mining technologies to help customers find new correlations, patterns
and trends. We use advanced technology to sift through large amounts
of data (structured data, text, audio, video, etc.) stored in
repositories and use pattern recognition and statistical and
mathematical techniques.&quot; In particular, their system
&quot;successfully performs entity extraction, a natural language
processing technique, to derive facts such as names, places,
organizations, locations and time from text.&quot;</p>


<h3><a href="http://grammarsoft.com/">GrammarSoft</a></h3>
<p>&quot;GrammarSoft ApS is a small company specializing in Language
Technology.&quot; Product offerings (for multiple European languages)
include morphological analyzers, part-of-speech taggers,
syntactic/dependency parsing, named-entity recognition, translation,
and tools for teaching language and spell checking.  They are a
spinoff of the <a href="http://beta.visl.sdu.dk/">Visual Interactive
Syntax Learning</a> (VISL) project.
</p>


<h3><a href="http://www.h5.com/solutions/h5edgeclassifier.html">H5</a></h3>
<p>
H5 builds classifiers for enterprise document management, especially in the
legal domain.
</p>

<h3><a href="http://labs.hakia.com/hakia-lab.html">hakia-Lab</a></h3>
<p>
&quot;Different than a familiar R&amp;D agenda in a search engine
company, we undertook highly specific research tasks solely dedicated
to the advancement of the core-competency in Web search. The main
challenge is to make science work in a constrained deployment
environment where speed, coverage, accuracy, and ease-of-use are high
priority considerations.&quot;
</p>
<p> 
hakia-Lab provides several technologies, including <a
href="http://labs.hakia.com/hakia-lab-onto.html">OntoSem</a>, &quot;a
formal and comprehensive linguistic theory of meaning in natural
language&quot;, <a href="http://labs.hakia.com/hakia-lab-qdex.html">QDEX</a>,
&quot;Query Detection and Extraction (QDEX) system was invented to bypass the limitations of the inverted index approach when dealing with semantically rich data&quot;, <a href="http://labs.hakia.com/hakia-lab-sema.html">SemanticRank</a>, &quot;a collection of methods to score and rank paragraphs&quot;, and
<a href="http://labs.hakia.com/hakia-lab-dial.html">Dialogue</a>,
&quot;the conversational (dialogue) systems where the search engine communicates with the user in an elevated level of confidence&quot;.
</p>

<h3><a href="http://www.hotneuron.com/">Hot Neuron</a></h3>
<p>
&quot;Clustify  groups related documents together into clusters and labels each cluster with keywords to tell you what it is about. It does both conceptual clustering and near-duplicate detection. This gives you a quick overview of the document set, and makes categorization of the documents more efficient and consistent. Clustify can process millions of documents on a desktop computer, bringing organization to large projects.&quot;
</p>
<p>
&quot;Hot Neuron Similarity uses a proprietary algorithm to quantify the similarity of a pair of documents. This software is demonstrated on MagPortal.com where articles that are determined to be sufficiently similar are marked in a database. The user can click on the icon next to an article he/she likes to retrieve the list of similar articles.&quot;
</p>

<h3><a href="http://infobright.com/">Infobright</a></h3>
<p>
&quot;The Infobright self-tuning analytic database delivers high performance without the work and cost of other solutions.&quot;
</p>

<h3><a href="http://www.infogistics.com/index.html">Infogistics</a></h3>
<p>
&quot;Infogistics are one of the leading companies providing
text-analysis, content extraction and document retrieval solutions
across multiple areas of industry including HR, law enforcement,
knowledge management and CRM.&quot;
</p>
<p>
&quot;Using advanced Natural Language processing technology developed
at Edinburgh University, Infogistics solutions enable information and
data contained in structured or unstructured text documents to be
retrieved, categorised, extracted and delivered to the right people at
the right time.&quot; Their NLP offerings include sentence
detection, part of speech tagging, and light syntactic chunking.  They
also offer higher-level products for specialized search, relationship
extraction and document parsing.
</p>

<h3><a href="http://www.inform.com">Inform</a></h3>
<p>
Inform does &quot;precise topic-based search for related
content&quot;.  Their technology involves text classification and
entity extraction for more-like-this applications.  You can
also check out the <a href="http://inform.com/index.aspx">Inform News Demo</a>.
</p>

<h3><a href="http://www.inquira.com/">Inquira</a></h3>
<p>
&quot;InQuira helps companies deliver more effective customer
service through their Web sites and contact centers.&quot; Their
product features &quot;integrated capabilities for natural language
search, knowledge base management, and analytics&quot;.
</p>
<p> 
Their product line includes <a
href="http://www.inquira.com/products_search.asp">InQuira Intelligent
Search</a>, &quot;a unified system that combines advanced linguistic
techniques and contextual understanding to provide unparalleled
capabilities for understanding, and responding to, the true intent
behind a users inquiry and browsing behavior.&quot;
</p>

<h3><a href="http://www.intelliresponse.com/">Intelliresponse</a></h3>
<p>
&quot;IntelliResponse delivers on the promise of web self-service by providing one right answer to visitor questions.&quot; 
</p>
<p>
Intelliresponse mainly works in question answering and
classification in the context of customer relationship management
(CRM).  About this they say their &quot;Patented 'one right answer'
solution understands precisely what the visitor wants, regardless of
the hundreds of ways a specific question can be asked&quot;. They hedge
this a bit later by saying &quot;While it's not possible (or a good
idea) to create an IntelliResponse knowledge base that would answer
every possible question that anyone would ask, it is very possible to
create one that will answer upwards of 90% of incoming
questions.&quot;
</p>
<p>
You can even try it by entering a question on their home page.
</p>


<h3><a href="http://www.irion.nl/english/index_en.html">Irion</a></h3>

<p>&quot;Irion Technologies has successfully picked up the challenge to
make computer programmes that make sense out of text, and really
understand human language.&quot;</p>

<p>&quot;Irion's software improves any web communication involving
human language, and applies to any organization in the world dealing
with textual information. This includes conceptual search, knowledge
management, E-commerce, customer support, and many other
applications.&quot;  Their technology seems to be
organized around classification.</p>


<a name="janya"></a>
<h3><a href="http://janyainc.com/">Janya</a></h3>

<p> &quot;Janya provides products and services to support information
discovery from unstructured and semi-structured data. With more than a
decade of experience developing and integrating this technology, Janya
works with customers and system integrators to incorporate information
discovery capability in both unclassified and classified
environments. By leveraging existing search technology and tools for
structured data analysis, Janya's solutions enable users to increase
their effective bandwidth to analyze text data streams.&quot;
</p>
<p>
Janya builds <a
href="http://www.janyainc.com/products/products_semantex.php">Semantex</a>,
&quot;an enterprise-class information extraction system that supports
the automatic or semi-automatic analysis of large volumes of
electronic information in order to detect entities, attributes,
relationships and events. Semantex represents a hybrid model for
information extraction, combining machine-learning and grammatical
approaches to achieve better results than any of the techniques could
individually.&quot; They also build case restoration and &quot;text
zoning&quot;.  </p>
<p>
Note that Janya was a spinoff of Cymfony, and Cymfony is
now part of <a href="#tns">TNS Media Intelligence</a>.
</p>


<h3><a href="http://www.jodange.com/">Jodange</a></h3>
<p>
&quot;Predicting outcomes and planning strategy through deeper understanding of opinion holders' sentiment over time.&quot;
</p>
<p>
They supply an <a href="http://www.jodange.com/downloads.html">extensive list of whitepapers</a>.
</p>


<h3><a href="http://www.justsystems.com/global/products/index.html">Just Systems</a></h3>

<p>&quot;Based on our natural language processing technology, this software was designed for corporations in collaboration with Dr. David Evance*, then a professor at Carnegie Mellon University. It enables sophisticated crossover search based on spoken language for text data on intranet, groupware and data warehouse (DWH).&quot;</p>


<h3><a href="http://www.languagecomputer.com/">LCC (Language Computer Corporation)</a> </h3>
<p>
LCC has a suite of "Cicero" tools (CiceroLite, CiceroCustom) which do
content extraction.
</p>


<h3><a href="http://www.nuance.com/nlp/">Language and Computing</a>--acquired by Nuance</h3>
<p>
&quot;Natural Language Processing (NLP) and Natural Language
Understanding (NLU) are technologies that can extract data and
information from free text documents for further processing. Language
and Computing (L&amp;C) is unique in delivering this level of
understanding through its integration of the world's largest medical
ontology with sophisticated linguistic processing algorithms.&quot;
</p>
<p>
See their recent <a href="http://www.nuance.com/nlp/MeaningfulUse_WP210QP-fin.pdf">white paper</a>.
</p>

<h3><a href="http://www.languagecraft.com/en/qa.html">Language Craft</a></h3>
<p>
&quot;Q and A system will allow you to use natural language to ask questions. The system is also capable of replying with the direct answer for the questions, rather than just showing related documents that contain the keyword in the questions.&quot;
</p>

<h3><a href="http://www.lexmasterclass.com/">Lexicography Master Class</a></h3>
<p> 
&quot;Lexicography MasterClass Ltd is a company specializing in
lexicography and lexical computing. We run training courses, design
and build language corpora, supply lexicographic software, and provide
a complete project-management service for lexicographic projects, from
conception to delivery.&quot;
</p>

<h3><a href="http://www.leximancer.com">Leximancer</a></h3>
<p>
&quot;Leximancer is a software tool that enables users to find meaning
from text-based documents. It automatically identifies key themes,
concepts and ideas from unstructured text with little or no
guidance.&quot;
</p>
<p>
Leximancer is focusing on word-based classification and automatic
taxonomy/synonym generation.
</p>

<h3><a href="http://www.lextek.com/">Lextek International</a></h3>
<p>
&quot;Lextek International supplies advanced information retrieval and natural language processing technology.&quot;
</p>
<p>
&quot;Our technologies are used in a wide variety of business
solutions. These range from document management systems to custom web
based applications.&quot;
</p>
<p>
Lextek supplies a general document classification engine,
a language identifier, and a document summarizer.
</p>

<h3><a href="http://en.lingjoin.com/product/ljparser.html">Ling-Join Software</a></h3>
<p>
&quot;With the strong technical accumulation and unique technical advantages, we can provide a comprehensive solution for users with searching and natural language understanding. Solutions based around the NLU SDK Modules, Text Mining Middleware, Web Application Engine, and others engine middlewares. It is convenient to build a variety of business applications with the flexible combination of multi-level and multi-angle.&quot;
</p>

<h3><a href="http://www.linguamatics.com/">Linguamatics</a></h3>
<p>
&quot;Linguamatics enables organizations to reap maximum return from
their available knowledge assets, by the effective deployment of
advanced natural language processing technology.&quot;
</p>
<p>
They extract entities and relations from text, resolving them
against ontologies.
</p>

<h3><a href="http://www.linguastat.com/">Linguastat</a></h3>
<p> &quot;Based on its patent-pending proprietary technology,
Linguastat offers a fully automated, web-based intelligence service
that automates a variety of business tasks that require reading and
analysis.&quot; </p>
<p>&quot;Linguastat can read, analyze, and continuously monitor any form of
electronic text: web pages, emails, documents, messages, chat logs,
transcripts, and other digital text.&quot;
</p>
<p>&quot;Linguastat can target and capture any number of user defined
messages: facts, statement, events, issues, opinions, concepts, and
topics to cover any precise area of interest.&quot;  </p>


<h3><a href="http://www.lingway.com/">Lingway</a></h3>
<p>
Lingway is a &quot;specialized search engine company&quot;.
&quot;Lingway solutions are built around a set of NLP
components. These enable users to develop specific applications or to
enhance existing applications by adding linguistic capabilities.&quot;
</p>
<p> The list of components they provide is &quot;chunking,
clustering, parsing, semantic expander, spell checker, tagging, word
sense disambiguation ...&quot;
</p>


<h3><a href="http://www.lockheedmartin.com/products/AeroText/index.html">Lockheed Martin</a></h3>
<p>
&quot;The AeroText&#8482; product suite provides a fast, agile information
extraction system for developing knowledge-based content analysis
applications. Possible applications include automatic database
generation, routing, browsing, summarizing and searching.&quot;
</p>
<p>
They do named entity extraction, coreference resolution,
part-of-speech and phrase extraction, clustering, topic
categorization, and &quot;event&quot; extraction.  They train by
example and have special capabilities for reasoning about locations
and times.  They offer language support for English, Arabic, Chinese,
Spanish and Indonesian.
</p>


<h3><a href="http://www.love.com/">Love.com</a></h3>

<p>&quot;Love.com is a fully-automated service that provides a selection of the latest news, videos, and images from around the World Wide Web based on a search term and display of such search results in a familiar blog-styled format. Love.com web search results include a real-time listing of websites and other content relevant to the search query. As with other automated services, the systems that power Love.com are agnostic to the specific terms searched including individual entities, sentiment, or context.&quot;</p>


<h3><a href="http://www.lexalytics.com/">LXA Lexalytics</a></h3>
<p>
&quot;Designed to help our customers address the basic problem of making their loosely structured information more valuable. We have created a set of products that attack the problem of discovering, understanding and acting on information that affects their business.&quot;
</p>
<p>
Lexalytics's products include
entity extraction, relation extraction, document summarization, sentiment analysis,
and classification.
</p>

<h3><a href="http://www.lymbix.com/">Lymbix</a></h3>
<p>
Lymbix has &quot;designed a suite of enterprise and social solutions built on our unique ability to precisely determine the tone in text-based communication. Whether it's deeper social media analytics or a solution for improved enterprise communication you're looking for, you can bet we have a product offering available (or in development) to match. Our team is ready to work with you.&quot;
</p>

<h3><a href="http://www.mmodal.com/technology.jsp/">M*Modal</a></h3>

<p>&quot;M*Modal's state-of-the-art Speech Understanding technology translates physician dictation in real-time into a searchable, structured document.&quot; </p>


<h3><a href="http://www.matrixware.com/">Matrixware</a></h3>

<p>&quot;Utilizing and implementing up to date research results in the
fields of computer science, language technology, and information
theory we at Matrixware enable our customers to skilfully navigate the
endless sea of patent literature.&quot;</p>



<h3><a href="http://www.meaningfulmachines.com/index.htm">Meaningful Machines</a></h3>

<p>&quot;Meaningful Machines develops, patents, and commercializes language
technologies based on a unique suite of methods that automate machine
understanding of natural language. The company is developing
technologies for use in machine translation (MT), text mining, machine
learning, and other applications that benefit from machine
understanding.&quot;
</p>
<p>Although they cite very general problems, their
<a href="http://www.meaningfulmachines.com/technologies.htm">technologies
page</a> only addresses machine translation.
</p>

<h3><a href="http://www.megaputer.com/">Megaputer</a></h3>
<p>
&quot;TextAnalyst
will help you quickly summarize, efficiently navigate, and
cluster documents in your textbase.&quot;
</p>
<p>Megaputer's TextAnalyst product extracts semantic networks and
summarizes text using a &quot;balanced combination of
linguistic and neural network investigation methods&quot;.  Their notion
of semantic network is &quot;the most important concepts from
the text and the relations between these concepts weighted by
their relative importance&quot;.  They also use these semantic
networks for clustering and exploration/search.
</p>

<h3><a href="http://www.metacarta.com">MetaCarta</a></h3>
<p>
&quot;<a
href="http://www.metacarta.com/products-data-modules.htm">GDMs</a>
[Geographic Data Modules] make up the core of MetaCarta products and hosted solutions. A GDM is a knowledge base used to identify and disambiguate geographic references, assign latitude/longitude coordinates, and confidence scores and relevance ranking. Each MetaCarta GDM contains linguistic statistics, gazetteer data, and natural language processing (NLP) logic.&quot;
</p>
<p>
&quot;MetaCarta <a
href="http://www.metacarta.com/products-geoweb-applications-geositesearch.htm">GeoSiteSearch</a>
is a Web portal drop-in that enables users to conduct a geographic search on any content, with search results ranked on location and text relevance.&quot;
</p>
<p>
MetaCarta specializes in location mention recognition and resolution
in text.  They use probabilistic models with confidence ranking.
</p>

<h3><a href="http://www.mnemonic.com">Mnemonic Technology</a></h3>
<p>
&quot;At Mnemonic, we can help you realize the full value of your information, whether structured or unstructured.&quot;
</p>
<p>
&quot;Our relevance models help you get the right information to the right people at the right time. They automatically learn to prioritize, categorize, monitor and summarize large volumes of unstructured textual information according to the unique requirements of individual users.&quot;
</p>
<p>
From what we can tell from their web site, their &quot;relevance
models&quot; learn scored text classifiers by example.  The only
application mentioned for text analytics is search query refinement.
</p>

<h3><a href="http://www.morphologic.hu/en/">Morphologic</a></h3>
<p>
Morphologic provides a range of products, mostly arranged around
morphologically sensitive bilingual translation dictionaries,
including thesauri.  Applications include copy editing such as
spelling checkers, hyphenators, grammar and style checkers; text
search tools including stemmers; and translation of full documents.
Tools include morphological analyzers including stemmers based on
unification grammars, syntactic analyzers, spell checkers and language
identifiers.
</p>

<h3><a href="http://www.netowl.com/">NetOwl</a></h3>
<p>
Extractor: &quot;Accurately perform entity extraction from
unstructured texts using advanced computational linguistics and
natural language processing.&quot;
</p>
<p>
Summarizer: &quot;Reliably generate abstracts and summaries of long
and complex documents.&quot;
</p>
<p>TextMiner: &quot;Empower users to find, organize, analyze, and mine a large volume of unstructured information using the most advanced text analysis technology available.&quot;
</p>
<p>
They also run in several languages.  They have some kind of
cross-document coreference.  They spun out of <a
href="http://www.sra.com/">SRA International</a>. They claim to have
&quot;best-of-breed entity extraction&quot; and &quot;unique link and
event extraction&quot;, but don't explain what the breed is and don't
list any unique features of their link and event extractors.  They
even claim &quot;NetOwl posted the highest score ever achieved for
name extraction from unformatted text, a score which has never been
equaled by another system.&quot;, but don't provide any details.
</p>

<h3><a href="http://www.nextit.com/ProductOverview.ashx">Next IT</a></h3>
<p>
&quot;Next IT's Human Emulation Software, ActiveAgent, creates Virtual Experts for organizations that are redefining customer service through technology.  As a single solution, ActiveAgent accurately understands and interprets users' natural language questions and delivers exact results. ActiveAgent leverages an organization's entire asset and resource portfolio through multiple service channels such as the web, contact center, intranet, mobile devices, and more with accuracy, scalability and operational efficiency.&quot;
</p>

<h3><a href="http://www.northernlight.com">Northern Light</a></h3>
<p>
&quot;What if your search engine could read all the market
intelligence reports and articles your company creates or licenses and
tell you what is in them, suggest to you what the business issues are
that they report on, and direct you to the documents that are the most
interesting to you, not from a search relevance perspective, but from
a meaning perspective?&quot;
</p>
<p>
Northern Light offers <a href="http://www.northernlight.com/mianalyst.html">Market Intelligence Analyst</a>, which contains entity extraction, sentiment analysis, relationship identification, meaning extraction and trend analysis.
There's a bit more information in their <a href="http://www.northernlight.com/downloads/MI_Analyst_Product_Sheet.pdf">MI Analyst Product Sheet</a>.
</p>

<h3><a href="http://northsideinc.com/">Northside</a></h3>
<p>
&quot;Were developing software that understands English, and can converse with people about a body of facts.&quot;.  Looks like they're still in the development phase,
but plan to use parsing to logical representations, ontologies
and natural language generation.&quot;
</p>

<h3><a href="http://www.nstein.com/ntelligent.asp">Nstein</a></h3>
<p>
&quot;Ntelligent Enterprise Search by Nstein is a powerful search
solution built to increase the efficiency and productivity of your
employees on intranets and portals. For public websites, it will guide
your customers in the most advanced discovery process. It quickly
delivers highly accurate search results in all circumstances.&quot;
</p>
<p>
&quot;One suspicious letter. One dangerous passenger boarding a
routine flight. One viral infection in a small village, in a distant
country. These events have led to tragedies we all wish had never
happened. Critical information preceded these events. Critical
information that could have been flagged by Nstein Technologies.&quot;
</p>

<h3><a href="http://www.nuxeo.com/en/products/ep">Nuxeo</a></h3>
<p>
&quot;Nuxeo Enterprise Platform (Nuxeo EP) is a Java-based content infrastructure designed to be used as a development environment for content- and case-based applications. Nuxeo EP is an extensible and configurable set of ECM services and modular plug-ins that allows an organization to build out specific horizontal or vertical applications.&quot;
</p>

<h3><a href="">Ontoprise</a></h3>
<p> 
&quot;SemanticMiner  utilizes the strengths of ontologies. These are knowledge models representing the relevant expertise within your department or company. They facilitate, moderated searches, optimization of the search results, unified view onto diverse sources.  With SemanticMiner, users find the relevant information required much faster and easily.  The SemanticMiner is available in two flavors: with a pre-configured web-interface or as a set of web services. These can be integrated into any SOA-compatible application or user interface.&quot;
</p>


<h3><a href="http://www.ontotext.com/">Ontotext</a></h3>
<p> 
&quot;Ontotext is a leading developer of core semantic technology,
which delivers applications in domains like Web Mining, EAI, KM, BI,
and Media Research.&quot; 
</p>
<p> 
&quot;Ontotext is a laboratory of <a
href="http://www.sirma.com/">Sirma</a>, active in several
research areas, including: Ontology Management; Information Extraction
and Retrieval (IE, IR); Semantic Web Services.&quot; Their products
include the <a
href="http://ontotext.com/kim/semanticannotation.html">KIM
Platform</a> for semantic annotation driven by a &quot;<a
href="http://ontotext.com/inference/semantic_repository.html">semantic
repository</a>&quot;.  The semantic annotation includes named-entity
extraction from a specified ontology.  They have <a href="http://ontotext.com/gate/index.html">contributed extensively to GATE</a> (for more info on GATE,
<a href="#gate">see above</a>).
</p>


<h3><a href="http://www.ostglobal.com/content/services/integrateditsolutions.htm">Optimal Solutions and Technologies</a></h3>

<p>&quot;OST develops tools and integrates resources that are a precise fit to our client's needs, size, and budget. Unlike many other businesses, we are not married to specific manufacturers or solutions providers that will limit the scope of our offerings.  Our Services Include Computational Linguistics&quot;</p>


<h3><a href="http://www.oracle.com/technology/products/bi/odm/index.html">Oracle Data Mining</a></h3>

<p>&quot;Oracle Data Mining (ODM) -- an option to Oracle Database 11g
Enterprise Edition -- enables you to easily build and deploy
next-generation applications that deliver predictive analytics and new
insights. Application developers can rapidly build next-generation
applications using ODM's SQL and Java APIs that automatically mine
Oracle data and deploy results in real-time-throughout the
enterprise.&quot;</p>

<p>Although not specifically aimed at natural language, there
is quite a bit of NLP-relevant technology in it, including naive Bayes,
decision trees, logistic regression, and support-vector machines.  </p>


<h3><a href="http://panscient.com/">Panscient</a></h3>

<p>&quot;Panscient is a content supplier for vertical search
engines.&quot; They have the interesting business model of supplying
<a href="http://panscient.com/products.htm">lists of people and
businesses</a> scraped from the entire <code>.com</code> domain of
corporate web sites and updated monthly.
They also develop <a href="http://panscient.com/services.htm">vertical
search applications</a>.
</p>

<h3><a href="http://www.paritycomputing.com/web/index.html">Parity
Computing</a></h3>

<p>&quot; Parity Computing's unstructured data management and
knowledge discovery solutions transform disparate data and content
into a knowledge network of actionable profiles and linked
relationships.&quot; </p>

<p>Parity offers the <a
href="http://www.paritycomputing.com/web/products/profiler_platform.html">Profiler System</a>,
which &quot;assembles and analyzes detailed profiles of key entities
such as people, institutions, and products, from disparate
unstructured documents and semi-structured data sources. ...
The key entities are extracted and assembled into distinct profiles using advanced machine learning heuristics. This includes normalization of spelling variations together with disambiguation of similarly-named entities (e.g. two people with the same name).&quot;
Additional functionality cited includes home page finding, extracting
patent references from web pages, etc.
</p>

<p>Parity also offers a lower level tool, the <a href="http://www.paritycomputing.com/web/products/reference_processor.html">Reference Processor</a>, a
&quot;fully automated software engine for high-accuracy reference processing and linking of publication databases and bibliographies in arbitrary formats.&quot;  Technology includes extraction, deduplication, clustering and
correction.</p>


<h3><a href="http://www.phrasetrain.com/">Phrasetrain</a></h3>
<p>
&quot;We're creating natural language technology that grows and improves as
it collects simple human judgments about language.  Using this
technology, we're building tools to search blog posts, feeds, and
other texts for key concepts, not just keywords and phrases. Think of
it as tagging meets natural language processing.&quot;  As of
February 2007, they only offered a mailing list.
</p>


<h3><a href="http://popupchinese.com/">Popup Chinese</a></h3>
<p>
Popup Chinese is a natural language processing engine for Chinese text. It uses a combination of a dictionary and statistical methods to intelligently and contextually segment text, identify parts of speech and manipulate text. The software provides hanzi-to-pinyin conversion, text segmentation and machine translation for a variety of applications including search, content extraction and more. It supports POS tagging and word sense disambiguation. The software is coded in object-oriented C++ and released under an open source license permitting commercial use for hanzi-to-pinyin conversion, text segmentation and machine translation purposes.
</p>


<h3><a href="http://www.purediscovery.com/">PureDiscovery</a></h3>

<p>&quot;PureDiscovery, a software company based in Dallas Texas, is reinventing the art of semantic search by harnessing collective intelligence. PureDiscovery is the creator of KnowledgeGraph, an intelligent software platform that transforms an organization's documents into a working collective intelligence. PureDiscovery KnowledgeGraph semantically connects people and knowledge in ways and on a scale that simply was not possible before.

We serve a variety of markets including: Legal, Human Capital Management, HomeLand Security / Defense,
and Intellectual Property
&quot;</p>


<h3><a href="http://www.q-go.nl/">Q-go</a></h3>

<p>&quot;Natural Language Search&quot;.</p> <p>Q-go offers the <a
href="http://customer.q-go.net/qgo_us/products_nlssuite.html">Q-go
Natural Language Search Product Suite</a>.  &quot;Q-go's Natural
Language Search gives organizations insight into visitors'
expectations and wishes and helps them adapt their online information
accordingly.&quot; &quot;The answers provided by Q-go are comparable
to those of call centers in terms of consistency, completeness and
quality, which is not only cheaper but also faster and easier for
organizations.&quot;</p>


<h3><a href="http://www.ql2.com/">QL2</a></h3>
<p>
&quot;QL2 Software's tools and solutions deliver business critical
data seamlessly and in real-time. QL2's technology integrates data
from virtually any source, inside and outside the firewall, with
existing applications and solutions. The result is better analytics
and smarter, more profitable decisions.&quot;
</p>
<p>
It wasn't clear from the web site whether any natural language
processing was involved in their products.
</p>

<h3><a href="http://rapid-i.com/">Rapid-i</a></h3>
<p>
&quot;RapidMiner (formerly YALE) is the world-leading open-source
system for knowledge discovery and data mining. It is available in
different flavours: a free open-source version licensed under the GPL,
a free version with an improved user interface, and under a developer
license (OEM) which allows the integration of RapidMiner as a powerful
library even into proprietary products. Enhance your products with
adaptability and innovative analytical features. By now, thousands of
applications of RapidMiner in more than 30 countries give their users
a competitive edge.&quot;
</p>

<h3><a href="http://www.recommind.com/">Recommind</a></h3>
<p>&quot;Sophisticated Search Review and Analysis Made Simple&quot;</p>
<p>
&quot;<a href="http://www.recommind.com/mindserver_categorization.html">MindServer Categorization</a>
automatically maps structured and unstructured information into an information structure - taxonomy, ontology, or subject heading classification.&quot;
</p>
<p>
&quot;The core technology powering Recommind's MindServer platform is based on patented, proprietary machine learning techniques including the Probabilistic Latent Semantic Analysis (PLSA) algorithms.&quot;
</p>

<h3><a href="https://www.recordedfuture.com/">Recorded Future</a></h3>
<p>
&quot;Temporal Analytics Engine:  A predictive analysis tool that allows you to visualize the future, past or present. How Recorded Future works:  1. Scour the web:  We continually scan thousands of high-quality news publications, blogs, public niche sources, trade publications, government web sites, financial databases and more. 2. Extract, analyze and rank:  We extract information from text including entities, events, and the time that these events occur. 3. Make it useful:  You can explore the past, present and predicted future of almost anything.  Powerful visualization tools allow you to quickly see temporal patterns, or link networks of related information.&quot;
</p>

<h3><a href="http://www.reeltwo.com/">Reel Two</a></h3>
<p>
&quot;Reel Two is tackling the tough problems in search and data analysis. Our software products and custom solutions provide scientists, analysts and managers with quick, intuitive access to the information that is most relevant to their work.&quot;
</p>
<p>
Reel Two spun out of the WEKA group at Waikato, and is primarily focused
on <a href="http://www.reeltwo.com/index.php?page=products&amp;subpage=cs">text classification</a> and <a href="http://www.reeltwo.com/index.php?page=products&amp;subpage=ee">entity extraction</a> as well as additional biomedical applications aimed at chemical name resolution.
</p>

<h3><a href="http://www.riverglassinc.com/">RiverGlass</a></h3>
<p>
&quot;Our groundbreaking data analytics technology is designed to deliver intelligent monitoring and analysis of unstructured data, all within the context of the analysis environment. With RiverGlass tools, organizations can make unstructured data sources like the Internet into true strategic information resources to help drive their success.&quot;</p>
<p>
From their <a href="http://www.riverglassinc.com/technology/capabilities.php">technology
capabilities page</a>, they appear to be doing search with ontology
integration for topic monitoring, entity extraction and link analysis.
</p>

<h3><a href="http://www.sap.com/solutions/sapbusinessobjects/index.epx">SAP BusinessObjects</a></h3>
<p>
&quot;SAP BusinessObjects offers a broad portfolio of tools and applications designed to help you optimize business performance by connecting people, information, and businesses across business networks.&quot;
</p>
<p>
Among its offerings are <a href="http://www.sap.com/solutions/sapbusinessobjects/large/business-intelligence/search-navigation/intelligent-search/featuresfunctions/index.epx">Intelligent Search</a> and <a href="http://www.sap.com/solutions/sapbusinessobjects/large/information-management/data-integration/textanalysis/index.epx">Text Analysis</a> software to &quot;extract, categorize, and summarize key information from unstructured text and convert it to a structured format so that it can be an effective data source for data integration or business intelligence.  Process in more than 30 languages.&quot;
</p>


<h3><a href="http://www.sas.com/text-analytics/text-miner/index.htm">SAS</a></h3>
<p>
&quot;SAS Text Miner incorporates advanced linguistic capabilities within the core data mining solution of SAS Enterprise Miner. Consolidating structured (quantitative) data analysis with unstructured (free-form text) provides complete views and meaningful insights within an integrated predictive modeling environment. Automating manual comprehension of the textual data sources, incorporating interactive drill-down reporting, and delivering algorithms for rigorous advanced analyses make it possible to grasp future trends and act on new opportunities more efficiently and with less risk.&quot;
</p>
<p>
&quot;The <a href="http://www.teragram.com/oem/">Teragram</a> division licenses its proprietary software and Linguistic data to other companies who embed our technologies into their corporate, Internet or software applications. Teragram has established itself as the leading technology and service provider for technologies such as Linguistics, pattern matching, Linguistic search and retrieval, international dictionaries for search and e-commerce, document management, and high demand Internet applications and services. By leveraging Teragram technologies, our customers are able to create new products quickly; improve the performance of their own products; expand their business to European, Arabic, and Asian markets; manage information more efficiently; and provide new functionalities to their own customers.&quot;</p>

<h3><a href="http://www.scoutlabs.com/">Scout Labs</a></h3>
<p>
&quot;A powerful, web-based application that tracks social media and finds signals in the noise
to help your team build better products and stronger customer relationships.&quot;
</p>
<p>
Their <a href="http://www.scoutlabs.com/features/">Scout Labs Product</a> includes
sentiment analysis, summarization and search over &quot;consumer-generated media&quot;.
</p>

<h3><a href="http://www.sdl.com/en/language-technology/products/automated-translation/sdl-language-weaver.asp">SDL Language Weaver</a></h3>
<p>
&quot;As the pioneer of statistical machine translation, Language Weaver provides trusted automated translation solutions to improve human communications for government and commercial organizations. Delivering a trusted level of translation quality.  Language Weaver ensures that organizations can communicate with global audiences in over 75 language combinations and is the only provider of automated translation with the capability to provide a TrustScore against each translation. TrustScore provides a prediction of the translation quality by the translation engine. This enables customers to build business rules around TrustScore and automate translation and publishing processes.&quot;
</p>

<h3><a href="http://www.semantia.com/">Semantia</a></h3>
<p>
&quot;Semantia is the expert in the automatic processing of natural written language for optimizing customer interactions.&quot;
</p> 

<h3><a href="http://www.semantra.com">Semantra</a></h3>
<p>
&quot;Semantra extends traditional BI and enterprise search applications, by empowering users to quickly and easily access precise, critical information from enterprise databases through a familiar search box and natural language.&quot;
</p>

<h3><a href="http://serendio.com/index.php">Serendio</a></h3>
<p>
Serendio's &quot;Customer Experience Analytics (CxA) solution derives insights from social media, emails, surveys, call center narratives and other forms of customer-centric content to provide a precise analysis of customer sentiments, perceptions, needs and wants...through a highly-available and secure SaaS application.&quot;
</p>
<p>
&quot;Based on Natural Language Processing (NLP) techniques, our <a href="http://serendio.com/technology">DisKoveror</a> platform supports a combination of statistical, linguistic and ontology based approach to deeper understanding of the text. This approach results in highly precise and comprehensive extraction of entities, facts and relationships.&quot;
</p>

<h3><a href="http://www.silobreaker.com/ProductsAndServices.aspx">Silobreaker</a></h3>
<p>
&quot;Silobreaker is an automated search service for news and current affairs that aims to provide more relevant results to the user than what traditional search and aggregation engines have been offering so far. Instead of returning just lists of articles matching a search query, Silobreaker finds people, companies, organizations, topics, places and keywords; understands how they relate to each other in the news flow, and puts them in context through graphical results in its intuitive user interface. The search result pages look similar to an online newspaper but are generated without human editing.
</p>
<p>
The site aggregates content on global issues, science, technology, energy, business, sports and entertainment from tens of thousands of news sources, blogs, multimedia, and other forms of news media from around the world. With the engine's focus on finding and connecting related data in the information flow, Silobreaker user tools and visualizations are ideal for bringing meaning to content from either today's Web or the evolving Semantic Web, or both.&quot;
</p>

<h3><a href="http://www.sinequa.com/">Sinequa</a></h3>
<p>
&quot;Sinequa is an innovative leading global provider of
Enterprise Search Solutions.&quot;
</p>
<p>
&quot;<a href="http://www.sinequa.com/solutions.html">Sinequa CS</a> has been developed by Sinequa as the ultimate, multi-lingual knowledge access platform. Featuring cutting-edge semantic and linguistic technologies, Sinequa CS is one of the most advanced Enterprise Search solutions available today.&quot;
</p>

<h3><a href="http://www.soliloquy.com/">Soliloquy</a></h3>
<p>
&quot;Soliloquy is the world's first company to offer 'intelligent,'
fully automated solutions that enable end users to find the
information, services and products they desire through targeted online
dialogs.&quot;
</p>
<p>
Soliloquy is in the business of <a href="http://www.soliloquy.com/solutions/dialog_mining.php">dialog mining</a>, a kind of text data mining over
dialogues. They claim to be &quot;the world's first company to offer turnkey solutions that enable end users to find the information and products they desire through intelligent, targeted dialogs.&quot;
</p>

<h3><a href="http://www.spss.com/software/modeling/modeler/">SPSS</a></h3>
<p>
&quot;PASW Modeler makes it easy to discover insights in your data. Its simple graphical interface puts the power of data mining in the hands of business users and high-performance increases analyst productivity.&quot;</p>

<h3><a href="http://ilk.uvt.nl/~stil/">STIL Language Technology</a></h3>

<p> &quot;Consultancy and software.  Bringing academic research to
business.&quot; &quot;STIL can offer solutions to businesses and
non-profit organizations that have a need to explore what language
technology could offer them, or with a need to integrate language
technology components in their information systems.&quot; </p>

<p>STIL offers software based on the TiMBL memory-based learning
system, including sequence taggers, shallow parsers, word-sense
disambiguation and morphological analysis.</p>


<h3><a href="http://www.swingly.com/">Swingly</a></h3>

<p>
&quot;Amazing natural language processing, semantic search, and question-answering technology.  We have built the largest index of questions and answers ever created:  we know the answer to more than 10 billion questions.&quot; 
</p>

<h3><a href="http://www.talend.com/products-data-integration/talend-products.php">Talend</a></h3>
<p>
&quot;Data management encompasses all measures implemented to support the use of data as a resource. The purpose of data management is to manage and supply accurate and timely data to business processes. Major disciplines in Data Management include data integration, data quality, Master Data Management, etc...All Talend products are built on a unified Eclipse-based development environment, which provides users with consistent ergonomics, fast learning curve and a high-level of reusability.&quot; 
</p>

<h3><a href="http://www.talis.com/">Talis</a></h3>
<p>
&quot;The Talis Platform makes it easy for developers to build powerful applications that use Semantic Web technologies and standards.

Delivered as Software as a Service (SaaS), the Platform dramatically reduces the complexity and cost of storing, indexing, searching and augmenting large quantities of data. &quot;
</p>

<h3><a href="http://www.temis.com">Temis</a></h3>
<p>
&quot;TEMIS develops and markets corporate Text Mining solutions.  Our software unlocks knowledge from unstructured data.&quot;
</p>
<p>
Temis's <a href="http://www.temis.com/index.php?id=59&amp;selt=1">core products</a> include an
&quot;information extraction server dedicated to the analysis of text documents&quot;, a hierarchical clusterer that &quot;proposes the most relevant classification for a given document collection&quot;,
and a classifier that &quot;classifies unstructured documents into pre-defined categories, combining statistical and linguistic analysis rules&quot;.  This is all based on &quot;XeLDA&quot;, their
&quot;multilingual linguistic engine&quot;.  They have an impressive <a href="http://www.temis.com/index.php?id=152&amp;selt=1">list of clients</a>.
</p>

<h3><a href="http://www.textanalysis.com/index.html">Text Analysis International</a></h3>
<p>
&quot;VisualText is the premier integrated development environment for
building information extraction systems, natural language processing
systems, and text analyzers.&quot;
</p>
<p>
TAI offers <a
href="http://www.textanalysis.com/Products/products.html">VisualText</a>,
&quot;an Integrated Development Environment for deep text analysis
applications.  Think of it as Visual C++ for Natural Language
Processing applications.&quot;  They also provide <a
href="http://www.textanalysis.com/Apps/Natural_Language_Processing/natural_language_processing.html">TAIParse</a>,
which includes part-of-speech tagging and noun-phrase chunking.  The
basic technology appears to be a <a href="http://www.textanalysis.com/tai-multi2003.pdf">multi-pass rule-based approach</a>.
</p>

<h3><a href="http://www.textkernel.com/">Textkernel</a></h3>
<p>
Textkernel offers resume parsing, information extraction, sentiment analysis, and general data mining.
</p>

<h3><a href="http://www.textmap.com/">TextMap</a></h3>
<p>
&quot;TextMap is a search engine for entities: the important (and not so important) people, places, and things in the news.&quot;
</p>
<p>
&quot;TextMap analyzes both the temporal and geographical distribution of news entities.&quot;
</p>
<p>
&quot;TextMap uses natural language processing techniques to track entity references in news sources, and a variety of statistical techniques to analyze the relationships between them.&quot;
</p>

<h3><a href="http://www.textore.net/index.html">TextOre</a></h3>
<p> 
&quot;We provide business-to-business (B2B) analytical software
and services to accurately examine and extract information from large
volumes of unstructured text.&quot;
</p>
<p> 
&quot;TextOre has the ability to perform searches that are highly
detailed, using multiple queries and in multiple languages, while
providing easily understood results. The results are provided
through an advanced visualization profile tool that identifies and
visually depicts the intensity of relationships in unstructured data
sources (letters, documents, e-mail and web pages), including
real-time news and information feeds. Our technology not only
identifies anomalies missed by competitive technologies, but also
identifies specific sentences, paragraphs and relationships, taking
into account the precise terms applied by a user.&quot; 
</p>

<h3><a href="http://www.textwise.com/">TextWise</a></h3>
<p> 
&quot;TextWise energizes your existing advertising portfolio by
offering high-resolution targeting, sophisticated media placement, and
hassle-free automation of both ad creation and placement.&quot;
</p>
<p> 
&quot;Semantic
Signatures are TextWise's patented contextual targeting
technology. They innovate beyond simple keyword-based or
category-based models currently used in so-called "contextual
advertising" and deliver a new level of context-driven advertisement
matching.&quot; They also
&quot;capture meaning through concepts, not keywords -- including multiple meanings and topics within a single document.&quot;
</p>

<a name="tns"></a>
<h3><a href="http://www.cymfony.com/About-Us/About-U">TNS Media Intelligence/Cymfony</a></h3>
<p> 
&quot; Cymfony, a division of TNS Media Intelligence, is a market
influence analytics company that sifts and interprets the millions of
voices at the intersection of traditional and social media such as
blogs and social networks to gain consumer insight and develop
stronger bonds with influencers.&quot;
</p>
<p> &quot;Cymfony's core is an advanced information extraction engine
that combines information retrieval and Natural Language Processing
(NLP) technologies to identify important people, places, companies,
concepts, relationships and events in documents.&quot; From the web
site, this looks like the latest version of Cymfony's &quot;InfoXtract
Engine&quot;.</p>
<p>
Cymfony spun off the government systems business to form
<a href="#janya">Janya</a>.
</p>

<h3><a href="http://www.trifeed.com/">Trifeed</a></h3>
<p>
&quot;In a fully automated process BullDoc(tm) server will crawl your organization resources (shared directories, submitted emails, specific web sites), feed them to the information extraction engine that will save the extracted data into the database....The system comes with plug-ins for many applications (MSWord, outlook, numerous web browsers) that enable the user to view the documents/emails/web pages in the way he use to, only gives her the ability to browse and navigate within a document to the relevant information.
&quot;
</p>

<h3><a href="http://www.vantagelinguistics.com/">Vantage Linguistics</a></h3>
<p>
&quot;As a world leader in the development of linguistic software
solutions, Vantage Linguistics continues to set the benchmark for
innovation and excellence in language-based research and artificial
intelligence.&quot;
</p>
<p>
Vantage offers a <a href="http://www.vantagelinguistics.com/products/">range of products</a>, including language identifiers, spell checkers, grammar checkers, and linguistically informed search.
</p>


<h3><a href="http://www.viewpoints.com/">Viewpoints Network</a></h3>

<p>&quot;Viewpoints Network is a social technology and media company focused on helping consumers make smarter decisions. We specialize in building communities and motivating &quot;social influencers&quot; to share their experiences by writing reviews, blog posts, how to guides, participating in discussion boards and contributing and voting on ideas. We then help organize and present those contributions to help other consumers make well informed purchase decisions.&quot;</p>


<h3><a href="http://www.visibletechnologies.com/trupulse.html">Visible Technologies</a></h3>

<p>Doing sentiment and phrase mining over news feeds, truPULSE &quot;helps you keep pace with the incredible speed and vast volume of social media conversations via an easy-to-use, RSS feed-based Web monitoring application. With truPULSE, organizations of all sizes can quickly and easily begin listening to and assessing online conversations about their brand.&quot;</p>


<h3><a href="http://www.wordtracker.com/">Wordtracker</a></h3>

<p>&quot;Wordtracker's leading-edge research tool gives you the keywords you need to rise above your competitors in search engine rankings. Even better, we
also show you how keyword research can help you discover untapped market niches, get inspiration for new products, and create compelling content that distinguishes your site from the pack.&quot;</p>


<h3><a href="http://www.xrce.xerox.com/Research-Development/Document-Content-Laboratory">Xerox European Research Centre</a></h3>
<p>
&quot;With the multiplication of on-line document repositories and the
phenomenal growth of the Web, a fantastic amount of information is
available at our fingertips. The central problem becomes that of
quickly accessing, within that mass, the arbitrary pieces of
information that are needed at any given time. As a large proportion
of the data is made up of natural language texts, any comprehensive
solution will rely heavily on natural language processing (NLP). Our
research agenda concerns theories, methods, tools and systems that
make it possible to uncover the content of natural language
texts.&quot;
</p>
<p>
XRCE provides demos and licenses for software for finite-state automata,
machine learning for categorization and clustering, robust parsing
and semantics.  You can find some online demos and links to research
software from the above link.
</p>

<h3><a href="http://yooname.com/">YooName</a></h3>

<p> &quot;YooName is Named Entity Recognition software based on
semi-supervised learning.  It identifies nine named entity categories
that are split into more than 100 sub-categories.&quot;</p>

<p>&quot;The YooName database and rule system are built using
semi-supervised learning techniques.&quot;</p>


<h3><a href="http://www.zoominfo.com/">ZoomInfo</a></h3>
<p>
<a href="http://www.zoominfo.com/About/products/zoominfo.aspx">ZoomInfo.com</a> &quot;is the premier business information search engine, with
profiles on more than 35 million people and 3.8 million
companies. ZoomInfo delivers a single site for quick and easy access
to in-depth information on industries, companies, people, products,
services and jobs.&quot;  &quot;ZoomInfo, a semantic search engine, uses its patented Natural Language Processing algorithms to understand and organize the business web.&quot;
</p>
<p>
ZoomInfo focuses on search for people, companies or jobs
on <a href="http://www.zoominfo.com">zoominfo.com</a>.
</p>


<h3><a href="http://www.zylab.com/Technology/text_mining_and_analytics.html">ZyLAB</a></h3>

<p>ZyLAB is in e-discovery, and offers general text data mining, including
entity extraction and summarization.  They also do visualization and machine translation.</p>


<a name="other"></a>
<h2>Lists of Tools and Corpora</h2>

<p>Lots of other groups have put together lists like this.  Almost all
of them are more comprehensive than ours on one-off packages (like Adwait
Ratnaparkhi's tagger, Michael Collins's parser, Eric Brill's tagger,
the YamCha SVM tagger, the Cambridge-CMU language toolkit, etc.), and
many also link to further lists.
</p>
<ul>
<li><a href="http://www-nlp.stanford.edu/links/statnlp.html">Stanford's Stat NLP/Corpus List</a></li>
<li><a href="http://www.ling.ohio-state.edu/~dickinso/corpus.html">Markus Dickinson's Corpora and Corpus Annotation List</a></li>
<li><a href="http://mallet.cs.umass.edu/index.php/Similar_software">MALLET's Similar Software Page</a></li>
<!-- <li><a href="http://compbio.uchsc.edu/corpora/bcresources.html">Alex Morgan's BioNLP Resource Page</a></li> -->
<li><a href="http://textanalytics.wikidot.com/">Text Analytics Wiki</a></li>
</ul>

</div><!-- content -->



<div id="foot">
<p>
&#169; 2003&ndash;2011 &nbsp;
<a href="mailto:lingpipe@alias-i.com">alias-i</a>
</p>
</div>
<script type="text/javascript">
var gaJsHost = (("https:" == document.location.protocol) ? "https://ssl." : "http://www.");
document.write(unescape("%3Cscript src='" + gaJsHost + "google-analytics.com/ga.js' type='text/javascript'%3E%3C/script%3E"));
</script>
<script type="text/javascript">
try {
var pageTracker = _gat._getTracker("UA-15123726-1");
pageTracker._trackPageview();
} catch(err) {}</script></body>
</html>


